Non-Linear Algorithms for Parametric Markov Programming

Authors

  • Boris S. Verkhovsky
  • Yuriy S. Polyakov
Abstract

The general idea of nonlinear correcting factors, which has been applied successfully to accelerate and derive algorithms for many linear and nonlinear problems, is used to develop several efficient procedures for solving linear systems of equations based on the Jacobi algorithm. Results of the theoretical study and extensive computer experiments for stochastic matrices are presented and analyzed to determine the conditions under which the studied algorithms converge, and the regions of their maximum convergence rate are identified. Applications of the algorithms to Markov decision processes are discussed.

Many problems in computer science and operations research can be formulated as Markov decision processes (MDPs) [1]. Examples include routing, optimal stopping, target, replacement, maintenance and repair, and inventory problems, as well as optimal control of queues and stochastic scheduling. The utility function of total expected discounted rewards is commonly used in MDPs with finite state and action spaces [1]. In this case, the optimality equation takes the following form:

$$V = \max_{\alpha} \left[ R(\alpha) + \omega S(\alpha) V \right], \tag{1}$$

where $V$ is a value (state) vector, $R(\alpha)$ is a reward vector, $\omega$ is a scalar discount factor, $S(\alpha)$ is a transition probability matrix, and $\alpha$ is a policy (control vector). The objective is to find the optimal policy $\alpha$ and the corresponding value vector $V$, which represents the maximum expected discounted sum of future rewards.

The main approaches to solving problem (1) are policy iteration, value iteration, and linear programming [1]. Both policy iteration and value iteration depend strongly on the efficiency of the algorithms used to solve a linear system of equations (LSE). The policy iteration algorithm solves the equation

$$V^{(i)} = R(\alpha) + \omega S(\alpha) V^{(i)} \tag{2}$$

for every iteration $i$ of policy $\alpha$. The value iteration algorithm calculates estimates of the next value vector iterate using the equation

$$V^{(i+1)} = \max_{\alpha} \left[ R(\alpha) + \omega S(\alpha) V^{(i)} \right]. \tag{3}$$

The main approaches to accelerating the convergence of algorithm (3) are similar to those for traditional LSE: Gauss-Seidel, Successive Overrelaxation (SOR), etc. [1]. As a result, faster LSE algorithms can help accelerate existing optimization algorithms and broaden the application area of MDPs.

Mathematically, a linear system of equations can be formulated as follows:

$$x = Ax + b, \tag{4}$$

where $A$ is a known coefficient matrix, $b$ is a known vector, and $x$ is an unknown solution vector. Two major approaches to solving LSE are commonly recognized: direct and iterative [2]. Direct methods [3] often involve factorization, such as Gaussian elimination, and forward and backward substitution on the vector $b$.
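To make the link between equations (2)-(4) concrete, the following is a minimal sketch in plain Python. It is not the authors' accelerated method and uses no nonlinear correcting factors: it only shows the plain fixed-point iteration for the form x = Ax + b, how policy evaluation (2) reduces to that form with A = ωS(α) and b = R(α), and the basic value iteration (3). All function names, the list-of-lists data layout, and the small two-state example are assumptions made for illustration.

```python
def mat_vec(A, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def fixed_point_solve(A, b, tol=1e-10, max_iter=10_000):
    """Iterate x <- A x + b, the form of equation (4), until the update is below tol.

    Converges when the spectral radius of A is below 1, which holds for
    A = omega * S with S stochastic and 0 <= omega < 1.
    """
    x = [0.0] * len(b)
    for _ in range(max_iter):
        x_new = [ax + bi for ax, bi in zip(mat_vec(A, x), b)]
        if max(abs(u - v) for u, v in zip(x_new, x)) < tol:
            return x_new
        x = x_new
    raise RuntimeError("fixed-point iteration did not converge")


def evaluate_policy(R, S, omega):
    """Solve V = R + omega * S V, equation (2), for one fixed policy."""
    A = [[omega * s for s in row] for row in S]
    return fixed_point_solve(A, R)


def value_iteration(rewards, transitions, omega, tol=1e-10, max_iter=10_000):
    """Equation (3), with the maximum taken state by state (an assumption).

    rewards[a][s] is the reward for action a in state s;
    transitions[a][s] is the row of S(a) for state s.
    Returns the value vector and a greedy policy (one action per state).
    """
    n_actions, n_states = len(rewards), len(rewards[0])
    V = [0.0] * n_states
    for _ in range(max_iter):
        Q = [[rewards[a][s]
              + omega * sum(p * v for p, v in zip(transitions[a][s], V))
              for a in range(n_actions)]
             for s in range(n_states)]
        V_new = [max(q) for q in Q]
        policy = [q.index(max(q)) for q in Q]
        if max(abs(u - v) for u, v in zip(V_new, V)) < tol:
            return V_new, policy
        V = V_new
    raise RuntimeError("value iteration did not converge")


# Hypothetical two-state example: action 0 stays put, action 1 swaps states.
rewards = [[1.0, 3.0], [0.0, 0.0]]
transitions = [[[1.0, 0.0], [0.0, 1.0]],
               [[0.0, 1.0], [1.0, 0.0]]]
V, policy = value_iteration(rewards, transitions, omega=0.5)
```

In this toy example the greedy policy swaps out of state 0 and stays in state 1, and evaluating that fixed policy with `evaluate_policy` gives the same value vector as value iteration. Any faster solver for x = Ax + b could be substituted for `fixed_point_solve` without changing the callers.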

Similar resources

On the optimization of Dombi non-linear programming

Dombi family of t-norms includes a parametric family of continuous strict t-norms, whose members are increasing functions of the parameter. This family of t-norms covers the whole spectrum of t-norms when the parameter is changed from zero to infinity. In this paper, we study a nonlinear optimization problem in which the constraints are defined as fuzzy relational equations (FRE) with the Dombi...

Presentation and Solving Non-Linear Quad-Level Programming Problem Utilizing a Heuristic Approach Based on Taylor Theorem

Multi-level programming problems are attractive to many researchers because of their applications in several areas such as economics, traffic, finance, management, transportation, information technology, engineering and so on. It has been proven that even the general bi-level programming problem is an NP-hard problem, so the multi-level problems are practical and complicated problems therefo...

A new non-parametric approach for suppliers selection

In this paper we propose a simple non-parametric model for the multiple criteria supplier selection problem. The proposed model does not generate a zero weight for a certain criterion and ranks the suppliers without solving the model n times (one linear programming (LP) problem for each supplier), and therefore allows the manager to get faster results. The methodology is illustrated using an example.

A Parametric Approach for Solving Multi-Objective Linear Fractional Programming Phase

In this paper, a multi-objective linear fractional programming problem with fuzzy variables and a vector of fuzzy resources is studied, and an algorithm based on a parametric approach is proposed. The proposed solving procedure is based on the parametric approach to find the solution, which provides the decision maker with more complete information in line with reality. The simplicity of the ...

An Application of the ABS LX Algorithm to Multiple Sequence Alignment

We present an application of ABS algorithms for multiple sequence alignment (MSA). The Markov decision process (MDP) based model leads to a linear programming problem (LPP), whose solution is linked to a suggested alignment. The important features of our work include the facility of alignment of multiple sequences simultaneously and no limit for the length of the sequences. Our goal here is to ...

Linear programming on SS-fuzzy inequality constrained problems

In this paper, a linear optimization problem is investigated whose constraints are defined with fuzzy relational inequality. These constraints are formed as the intersection of two inequality fuzzy systems and Schweizer-Sklar family of t-norms. Schweizer-Sklar family of t-norms is a parametric family of continuous t-norms, which covers the whole spectrum of t-norms when the parameter is changed...

Publication date: 2007